Plaque Begets Plaque, ApoB Does Not
The Statistics Cause Doubt
John Slough
Plaque begets plaque: biology or mathematical artifact?
“All baseline plaque metrics (coronary artery calcium, NCPV, total plaque score, and percent atheroma volume) were strongly associated with the change in NCPV.”
Change in noncalcified plaque volume \(\Delta \text{NCPV}\) was the outcome:
\[ \Delta \text{NCPV} = \text{NCPV}_{1} - \text{NCPV}_0 \]
They regressed \(\Delta \text{NCPV}\) directly on its baseline value \(\text{NCPV}_0\):
\[ \Delta \text{NCPV} = \alpha + \beta \, \text{NCPV}_0 + \varepsilon \]
But this introduces mathematical coupling, because \(\text{NCPV}_0\) appears on both sides of the equation:
\[ \text{NCPV}_{1} - \text{NCPV}_0 = \alpha + \beta \, \text{NCPV}_0 + \varepsilon \]
The regression coefficient (slope): \(\beta = \frac{\operatorname{Cov}((\text{NCPV}_1 - \text{NCPV}_0),\ \text{NCPV}_0)}{\operatorname{Var}(\text{NCPV}_0)} = \frac{\operatorname{Cov}(\Delta \text{NCPV},\ \text{NCPV}_0)}{\operatorname{Var}(\text{NCPV}_0)}\)
This simplifies to: \(\beta = \frac{\rho\, \sigma_1 - \sigma_0}{\sigma_0}\)
where \(\rho\) is the correlation between \(\text{NCPV}_0\) and \(\text{NCPV}_1\), and \(\sigma_0\), \(\sigma_1\) are their standard deviations.
With baseline NCPV contributing to both predictor and outcome, the slope captures an inseparable combination of mathematical coupling and possible biological change.
It is not a clean estimate of baseline influence.
From Oldham* (1962): \(\beta > 0 \quad\text{if}\quad \rho > \frac{\sigma_0}{\sigma_1}\)
The slope depends only on the correlation \(\rho\) and the ratio \(\sigma_0/\sigma_1\); a positive or negative slope can therefore arise purely from the math.
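Oldham’s condition is easy to verify numerically. A minimal sketch in Python (numpy only; the correlation and spreads are illustrative choices, not the study’s values):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho, sd0, sd1 = 0.5, 10.0, 8.0            # here rho < sd0/sd1, so the slope must be negative

# Simulate baseline and follow-up with the chosen correlation and spreads
z0 = rng.standard_normal(n)
z1 = rho * z0 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
ncpv0 = 100 + sd0 * z0
ncpv1 = 110 + sd1 * z1

delta = ncpv1 - ncpv0
beta_hat = np.cov(delta, ncpv0)[0, 1] / np.var(ncpv0, ddof=1)
beta_theory = (rho * sd1 - sd0) / sd0     # Oldham's expression: (0.5*8 - 10)/10 = -0.6

print(beta_hat, beta_theory)              # empirical slope matches -0.6 closely
```

Flipping the inequality (e.g. \(\rho = 0.9\) with \(\sigma_1 > \sigma_0\)) flips the sign, with no change in the biology of the simulated data (there is none).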
In this study, the observed correlation and variance ratio tend to bias the slope upward; whether the resulting positive association reflects biology, coupling, or both cannot be determined from this model.
*Source: Oldham, 1962, J. Chronic Dis.
set.seed(42) # Make the simulation reproducible
n <- 100 # Number of values to generate
ncpv_0 <- rnorm(n, mean = 100, sd = 10) # Baseline NCPV: 100 random values centered at 100
ncpv_1 <- rnorm(n, mean = 110, sd = 10) # Follow-up NCPV: independent of baseline by construction
delta_ncpv <- ncpv_1 - ncpv_0 # Change score
coef(lm(delta_ncpv ~ ncpv_0)) # Slope is about -1 despite zero true association
[Figure: scatterplots of the simulated data. Panel 1, “No true association” (NCPV₁ vs NCPV₀); Panel 2, “Association due to mathematical coupling” (ΔNCPV vs NCPV₀).]
An alternative is to model NCPV at follow-up (\(\text{NCPV}_1\)) directly while adjusting for baseline NCPV. This example uses ApoB as the independent variable:
\[ \text{NCPV}_1 = \alpha + \gamma\,\text{NCPV}_0 + \beta\,\text{ApoB} + \varepsilon \]
This approach avoids mathematical coupling, reduces residual variance, and allows the coefficient on \(ApoB\) to reflect biological association — not algebraic structure.
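A sketch of this specification with simulated data (Python/numpy; the effect sizes of 0.9 for baseline and 0.5 for ApoB are assumed for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
apob = rng.normal(90, 15, n)                      # illustrative ApoB values
ncpv0 = rng.normal(100, 10, n)                    # baseline NCPV
# Assumed data-generating model: follow-up depends on baseline and ApoB
ncpv1 = 5 + 0.9 * ncpv0 + 0.5 * apob + rng.normal(0, 5, n)

# Follow-up as the outcome, baseline as a covariate: no coupling
X = np.column_stack([np.ones(n), ncpv0, apob])
alpha, gamma, beta = np.linalg.lstsq(X, ncpv1, rcond=None)[0]
print(round(gamma, 2), round(beta, 2))            # recovers ~0.9 and ~0.5
```

Because \(\text{NCPV}_0\) no longer appears inside the outcome, the coefficient on ApoB is free of the algebraic artifact.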
You could test whether baseline NCPV predicts follow-up using a mixed-effects model with a Time × Baseline interaction:
\[ \text{NCPV}_{ij} = \alpha + \gamma\,\text{Time}_{ij} + \beta\,\text{NCPV}_{0j} + \delta\,(\text{Time}_{ij} \cdot \text{NCPV}_{0j}) + b_j + \varepsilon_{ij} \]
but with the limited number of subjects this must be done carefully.
You cannot determine whether a positive or negative slope from ΔNCPV ∼ NCPV₀ reflects biology or math, because the math builds the relationship. To isolate biological effects, you must model follow-up directly with baseline as a covariate — not as part of the outcome.
Plaque begets plaque: biology or a mathematical artifact?
At best: exploratory.
At worst: misleading.
“Linear models on the primary (NCPV) and secondary outcomes were univariable”
Despite having multiple baseline predictors available
(age, sex, ApoB, Δ-ApoB, CAC, NCPV₀, LDL-C exposure),
each was tested separately in single-predictor regressions.
This modeling choice introduces omitted-variable bias:
“…omitting a relevant variable from a model which explains the independent and dependent variable leads to biased estimates.”
Wilms (2021)
When a predictor is correlated with both the outcome and another omitted variable, its coefficient may absorb the effect of that omitted factor.
They modeled:
\[ \Delta \text{NCPV} = \alpha + \beta \cdot \text{ApoB} + \varepsilon \]
But if age also predicts ΔNCPV and correlates with ApoB, then \(\beta\) is biased — it partly reflects the effect of age.
A more appropriate model would be:
\[ \Delta \text{NCPV} = \alpha + \beta_1 \cdot \text{ApoB} + \beta_2 \cdot \text{Age} + \varepsilon \]
This separates the contribution of ApoB from that of age.
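The mechanism is easy to reproduce. A numpy sketch assuming age drives both ApoB and ΔNCPV while ApoB has no true effect at all:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
age = rng.normal(55, 8, n)
apob = 0.8 * age + rng.normal(0, 8, n)        # ApoB correlated with age
delta_ncpv = 0.5 * age + rng.normal(0, 5, n)  # age, not ApoB, drives the change

# Univariable model: the ApoB slope absorbs the omitted age effect
b_uni = np.polyfit(apob, delta_ncpv, 1)[0]

# Adjusted model: with age included, the ApoB coefficient returns to ~0
X = np.column_stack([np.ones(n), apob, age])
b_adj = np.linalg.lstsq(X, delta_ncpv, rcond=None)[0][1]
print(round(b_uni, 2), round(b_adj, 2))
```

The univariable slope on ApoB is clearly positive even though ApoB does nothing; adjusting for age removes it.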
Univariable linear models make confounding almost certain, especially in small, non-randomized human data, undermining any claim of association or non-association.
Univariable Linear Models are exploratory and can be misleading.
“Estimated lifetime LDL-C exposure was only a significant predictor of final NCPV in the univariable analysis but lost significance when age was included as a covariate. Both age and lifetime LDL-C exposure lost significance when baseline CAC was included in the model.”
So they did use multivariable models, but only for follow-up NCPV, not for ΔNCPV, the paper’s main endpoint.
This selective use raises questions:
Were multivariable models used selectively for some reason? And why not on the study’s main endpoint?
Unusual. Fragile. Overstated.
“Bayesian inference adds credence to finding that there is no association between NCPV vs LDL-C or ApoB…”
The authors used Bayesian inference to support their finding that ApoB has no association with plaque progression.
This is unusual in a non-randomized, uncontrolled, 1-year observational study on a highly restricted sample.
This is a misuse of Bayesian inference: using it to “add credence” to a claim of no effect on top of an uncontrolled study design and simplistic modeling.
“Bayes factors were calculated … using an rscale value of 0.8 to contrast a moderately informative prior with a conservative distribution width (to allow for potential large effect sizes)”
Bayes factors compare how well the data are predicted under H₀ vs. H₁ — but H₁ isn’t just any nonzero effect; it’s a distribution of plausible effect sizes defined by the prior.
The authors used a wide, Cauchy-like prior (r = 0.8), “due to the well-documented association between ApoB changes and coronary plaque changes.”
This choice of prior isn’t wrong, but it is subjective and fragile, and it drives the result.
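A sensitivity analysis would have shown how much. The sketch below computes a Bayes factor for a hypothetical near-null estimate under several prior widths, using a normal likelihood and a Cauchy prior on the effect (scipy; the effect of 0.05 and standard error of 0.15 are invented for illustration, not taken from the paper):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def bf01(effect, se, r):
    """BF01: marginal likelihood under H0 (effect exactly 0) divided by
    the marginal likelihood under H1 (Cauchy(0, r) prior on the effect)."""
    def integrand(delta):
        return stats.norm.pdf(effect, loc=delta, scale=se) * stats.cauchy.pdf(delta, scale=r)
    m1, _ = quad(integrand, -np.inf, np.inf)   # evidence under H1
    m0 = stats.norm.pdf(effect, loc=0, scale=se)  # evidence under H0
    return m0 / m1

# Same hypothetical estimate, three prior widths
for r in (0.2, 0.5, 0.8):
    print(r, round(bf01(0.05, 0.15, r), 2))
```

The same data yield only weak evidence for the null under a narrow prior and “moderate” evidence under a wide one; without a reported sensitivity analysis, the reader cannot tell how much of the BF is prior.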
“In other words, these data suggest it is 6 to 10 times more likely that the hypothesis of no association between these variables (the null) is true as compared to the alternative.”
That is an overstatement of what the Bayes factor tells us.
A Bayes factor of 6–10 means the data are 6–10× more likely under the null model than under the alternative model, not that the null hypothesis is 6–10× more likely to be true.
They could have said: “The data are 6–10 times more likely under the no-association model than under the alternative.”
\[ \text{Posterior Odds} = \text{Bayes Factor} \times \text{Prior Odds} \]
To claim that the null is 6–10× more likely to be true, they would have to assume prior odds = 1:1 — and state that explicitly. They didn’t.
Even without other issues (e.g. confounding / non-adjusted variables, short follow-up, etc.), the reported BF of 6.3 for ΔNCPV ~ ApoB reflects only moderate evidence for no effect — not strong or decisive.
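The conversion the authors skipped is one line of arithmetic. Assuming 1:1 prior odds (an assumption they would have had to state):

```python
bf01 = 6.3                        # reported Bayes factor in favor of the null
prior_odds = 1.0                  # 1:1 prior odds: an assumption, not a given
posterior_odds = bf01 * prior_odds
p_null = posterior_odds / (1 + posterior_odds)
print(round(p_null, 3))           # 0.863: ~86% posterior probability, only under 1:1 priors
```

Any other prior odds shifts this number; the Bayes factor alone never fixes it.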
Table 1. A heuristic classification scheme for Bayes factors (BF₁₀). Source: SpringerLink.
(Assuming BF₁₀ as per standard conventions; if BF₀₁, it reverses)
| Issue | What was done | Consequence |
|---|---|---|
| ΔNCPV as outcome | Regressed (NCPV₁ − NCPV₀) on baseline NCPV₀ | Mathematical coupling → regression slopes reflect algebra, not just biology |
| Univariable regressions | Each predictor tested separately (ApoB, LDL-C, age…) | No confounder adjustment → biased, unreliable, low-credibility estimates |
| Bayesian inference | Used to support “no ApoB effect” & “plaque begets plaque” | Unadjusted, observational data → misleading, unusual use of Bayesian inference |
| Prior choice (rscale = 0.8) | Prior assumes large effects | No sensitivity analysis → results likely prior-driven |
| Bayes factor interpretation | Claimed null is “6–10× more likely” | Bayes factor misstated as posterior probability → compares model fit, not truth |
| Headline claim | “Plaque Begets Plaque, ApoB Does Not” | Overstates evidence → mathematically coupled, confounded, and fragile analysis |
“Plaque Begets Plaque, ApoB Does Not”
The study regressed mathematically coupled differences in plaque volume (ΔNCPV) on baseline variables using univariable models. It then made the unusual choice to apply Bayesian inference (again without accounting for confounding), used a prior without reporting a sensitivity analysis (on which the strength of the Bayes factor likely hinges), and interpreted the Bayes factors as evidence for no ApoB effect, culminating in a headline that far exceeds what the statistical methods can credibly support.
Predictors tested against two outcomes (Δ-NCPV, Δ-TPS) → ≥10 comparisons
Numerous regressions were run — increasing false positive risk
But no multiple testing correction was applied.
Perhaps the authors viewed this as exploratory, where correction is often skipped —
but then why title the paper “Plaque Begets Plaque, ApoB Does Not”?
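A Holm–Bonferroni step-down adjustment for m = 10 tests would have been trivial to report. A sketch with ten illustrative p-values (not the paper’s):

```python
# Holm-Bonferroni step-down adjustment for m = 10 univariable tests
p_values = sorted([0.004, 0.03, 0.04, 0.06, 0.11, 0.20, 0.31, 0.45, 0.62, 0.88])
m = len(p_values)

adjusted = []
running_max = 0.0
for i, p in enumerate(p_values):
    # Multiply the i-th smallest p by (m - i), capped at 1, and keep monotone
    running_max = max(running_max, min(1.0, (m - i) * p))
    adjusted.append(running_max)

print(adjusted[:3])
```

Only the smallest p-value (0.004 → 0.04) remains below 0.05 after adjustment; several “significant” univariable results would not survive.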
Baseline NCPV median = 44 mm³; TPS median = 0 → ≥50% of values are zero
CCTA cannot report negative plaque → both outcomes are left-censored at 0
This affects not just modeling but measurement:
When true plaque ≈ 0, error is asymmetric — it can only overestimate.
ΔNCPV, their primary outcome, is a change score between two bounded, skewed measures.
Likely to produce non-normal residuals and heteroscedasticity (e.g., larger spread at higher baseline).
If smaller baseline values were also linked to larger increases,
this may reflect the effects of left-censoring and error asymmetry — not true biological acceleration.
These issues are clear in TPS, and may affect NCPV, but diagnostics are not shown.
OLS assumes homoscedastic, normal residuals
performance::check_model() was run — but no output provided
They could have considered methods that address this, such as Tobit regression, a log transform, or robust standard errors.
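The censoring artifact itself is easy to demonstrate: if a noisy measurement is floored at zero, subjects whose true plaque is near zero can only be over-read, manufacturing larger apparent increases at low baseline. A numpy sketch under those assumptions (the noise level and thresholds are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
true_plaque = rng.exponential(scale=20, size=n)   # skewed truth, many values near zero

def measure(truth):
    # Hypothetical CCTA-like reading: additive noise, floored at 0 (left-censored)
    return np.maximum(0.0, truth + rng.normal(0, 10, truth.size))

baseline = measure(true_plaque)
followup = measure(true_plaque)                   # zero true change for everyone
delta = followup - baseline

low_mean = delta[baseline < 5].mean()             # near the floor
high_mean = delta[baseline > 40].mean()           # well above the floor
print(round(low_mean, 2), round(high_mean, 2))
```

With no true change anywhere, the low-baseline group still shows apparent “growth”: the floor at zero plus measurement error can only push small values up, while higher baselines regress back down.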